[BugFix] Abort BE vacuum tasks once the FE caller's timeout elapses#74694
Open
starrocks-xupeng wants to merge 2 commits into
Open
[BugFix] Abort BE vacuum tasks once the FE caller's timeout elapses#74694starrocks-xupeng wants to merge 2 commits into
starrocks-xupeng wants to merge 2 commits into
Conversation
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9620972e97
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
The FE gives up waiting for a vacuum RPC after its brpc timeout (1 hour), but the BE task keeps running as a zombie: it occupies one of the few RELEASE_SNAPSHOT workers and races with re-dispatched vacuums of the same partition for hours, while nobody reads its response. Carry the FE timeout in VacuumRequest.timeout_ms. The BE vacuum handler anchors an absolute deadline when the request is received and threads it through the vacuum execution; the version-chain walk checks the deadline on each iteration and aborts with Status::TimedOut once it passes. The check at vacuum entry also kills tasks that already exceeded the deadline while waiting in the thread pool queue. Requests from older FEs without the field carry no deadline and run to completion as before.
9620972 to
efca032
Compare
Contributor
|
No new undocumented parameters detected by the param-drift check. |
xiangguangyxg
approved these changes
Jun 12, 2026
Comment on lines
+321
to
+324
| // The longest this FE waits for the response (the brpc timeout of the vacuum RPC). | ||
| // The BE checks it periodically during execution and aborts the task once it has | ||
| // elapsed, instead of running on as a zombie that no caller is waiting for. | ||
| vacuumRequest.timeoutMs = LakeService.TIMEOUT_VACUUM; |
Contributor
There was a problem hiding this comment.
better to make vacuum timeout configurable
Contributor
[Java-Extensions Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
Contributor
[FE Incremental Coverage Report]✅ pass : 1 / 1 (100.00%) file detail
|
Contributor
[BE Incremental Coverage Report]✅ pass : 20 / 20 (100.00%) file detail
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why I'm doing:
The FE gives up waiting for a vacuum RPC after its brpc timeout (
LakeService.TIMEOUT_VACUUM, 1 hour), marks the partition vacuum as failed and re-dispatches it shortly after. But the BE-side vacuum task is never cancelled: it keeps running as a zombie, occupying one of the few workers of theRELEASE_SNAPSHOTthread pool (5 by default) for hours, while nobody reads its response. On clusters with partitions that accumulated a huge number of versions, zombie tasks can exhaust the whole pool: newly dispatched vacuum requests pile up in the queue, vacuum throughput collapses cluster-wide, and the version backlog keeps growing.What I'm doing:
optional int64 timeout_ms = 11toVacuumRequest(gensrc/proto/lake_service.proto): the maximum duration the FE caller waits for the request.AutovacuumDaemon#vacuumPartitionImplfillstimeoutMswithLakeService.TIMEOUT_VACUUM, the brpc timeout of the vacuum RPC, so the BE deadline matches exactly how long the FE actually waits.LakeServiceImpl::vacuumanchors an absolute deadline (butil::gettimeofday_ms() + timeout_ms) at the time the request is received and passes it tolake::vacuum(newdeadline_msparameter, default0= no deadline, all other callers unchanged).vacuum_implchecks the deadline once at entry — a task that already exceeded the deadline while waiting in the thread pool queue aborts without doing any work;collect_files_to_vacuumchecks it on each iteration of the version-chain walk (the dominant cost for high-version-count partitions) and aborts withStatus::TimedOut, freeing the worker. Aborting between walk iterations leaves the metadata chain untouched, so the next vacuum round resumes from the same state.lake_vacuum_enable_task_timeout(defaulttrue) gates the deadline: when set tofalsethe BE ignorestimeout_msand vacuum tasks always run to completion.timeout_ms(older FE versions) carry no deadline and run to completion, exactly as before.test_vacuum_deadline_expired_mid_walk(deadline expires mid-walk via a mocked clock on the newvacuum:check_deadlinesync point: nothing is deleted, and a follow-up run without deadline converges normally),test_vacuum_task_deadline_exceeded(handler threadstimeout_msinto the task and returnsTIMEOUT; a request without the field is unaffected, and so is any request whenlake_vacuum_enable_task_timeoutis off), andtestVacuumRequestCarriesTimeout(FE fills the field).Fixes #issue
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: